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Evolutionary adaptation is a process that increases a population's 
"fit" to the world it inhabits, and is often likened to climbing a hill 
or peak. While this process is simple for fitness landscapes where 
each mutation provides an advantage (or disadvantage) to the or- 
ganism that is independent of the fitness effect of a mutation on 
a different locus, the interaction between mutations (epistasis) as 
well as mutations at loci that affect more than one trait (pleiotropy) 
are crucial in complex and realistic fitness landscapes. We inves- 
tigate the impact of epistasis and pleiotropy on adaptive evolution 
by simulating the evolution of a population of asexual haploid or- 
ganisms (haplotypes) in a model of N interacting loci, where each 
locus interacts with K other loci. We use a quantitative measure of 
the magnitude of epistatic interactions and find that it is a mono- 
tonically increasing function of K. When haplotypes adapt at high 
mutation rates, more epistatic pairs of substitutions are observed on 
the line of descent than expected, whereas at low mutation rates 
epistatic interactions are avoided. The highest fitness is attained 
in landscapes with an intermediate amount of ruggedness {K ^ 5) 
because while both the height of the global peak and the average 
selection coefficient per beneficial mutation increase with K, inter- 
acting with too many other loci leads to untraversable landscapes. 
Our findings imply that the synergism between loci that interact 
epistatically is crucial for evolving genetic modules with high fitness, 
while too much ruggedness stalls the adaptive process. 

Introduction 

As a population adapts to its environment, it accumulates 
mutations that increase the chance for the long-term suc- 
cess of the lineage (or lineages) it represents. The standard 
picture for this process is Fisher's geometric model [1] of evolu- 
tion by small steps, i.e., the accumulation of many mutations 
with small benefit. The evidence supporting this concept, 
however, is scarce [2], and many open questions remain [3]. 

More modern treatments use stochastic substitution mod- 
els [4-8] to understand the adaptation of DNA sequences. If 
the mutation rate is small and selection is strong, the adaptive 
process can explore at most one mutational step away from the 
wild type, so that mutations are fixed sequentially and delete- 
rious mutations only play a minor role (if any) [8]. However, 
if the rate of mutation is high and/or or selection is weak, 
or effective population sizes are large, mutations can interact 
significantly and adaptation does not proceed solely via the 
accumulation of only beneficial (and neutral) mutations (this 
is sometimes called the concurrent mutations regime [9,10]). 
In this regime, deleterious mutations may play an important 
role as stepping stones of adaptive evolution that allow a 
population to traverse fitness valleys. Kimura, for example, 
showed that a deleterious mutation can drift to fixation if fol- 
lowed by a compensatory mutation that restores fitness [11]. 
In the long-term evolution experiment with E. coli bacteria, 
it was shown that lineages with multiple mutations compete 
for thousands of generations before fixation [12], and a new 
adaptive trait emerged in one of the lines that required sev- 
eral mutations that appear to interact with each other [13]. 
Also, work using computational simulations of evolution has 
shown that deleterious mutations are crucial for adaption, and 



interact with subsequent mutations to create substantial bene- 
ficial effects [14-18]. Even though the potential of interacting 
mutations in adaptive evolution has been pointed out early 
by Zuckerkandl and Pauling [19], their importance in shaping 
adaptive paths through a fitness landscape has only recently 
come to the forefront [20-23] , and is still a topic of much dis- 
cussion [24-27]. In this respect, the impact of the sign (i.e., 
positive or negative) as well as the size of epistasis on adapta- 
tion, and how this impact is modulated by the mutation rate 
is, has not received the attention it deserves [28-30]. 

If we move from the single gene level to networks of genes, 
the situation becomes even more complex [31]. Gene net- 
works that have been explored experimentally are strongly 
epistatic [32-35] , and allelic changes at one locus significantly 
modulate the fitness effect of a mutation at another locus [36] . 
To understand the evolution of such systems, we have to take 
into account the interaction between loci, and furthermore 
abandon the limit where mutations on different loci fix sequen- 
tially. Because a comprehensive theory of adaptation in the 
concurrent mutations limit, with arbitrary mutational effects 
and epistasis does not currently exist, we quantify the impact 
of epistasis on evolutionary adaptation (and the dependence 
of this impact on mutation rate) by studying a computational 
model of a fitness landscape of N loci, whose ruggedness can 
be tuned: the NK landscape model of Kauffman [37-39]. The 
model (and versions of it known as the "blocks model") has 
been used to study a variety of problems in evolution (see, 
e.g., [38,40-45]), but most concern the evolution of beneficial 
alleles at a single locus. Models that study interacting gene 
networks (for example, transcriptional regulatory networks) 
have focused mainly on the topology, robustness, and mod- 
ularity of the network [46-49]. Instead, we are interested in 
the evolution of the allelic states of the network as a popula- 
tion evolves from low fitness to high fitness: how interacting 
mutations allow the crossing of fitness valleys, and how the 
ruggedness of the landscape shapes the evolutionary path. 
Valley crossing for asexual populations has been studied in 
detail before [50], focusing on the probability that a valley 
can be crossed rather than the impact of genetic interactions. 

As opposed to most work studying adaptation in the NK 
fitness landscape, we do not focus on population observables 
such as mean fitness, but rather study the line of descent 
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in each population in order to characterize the sequence and 
distribution of mutations that have come to represent the evo- 
lutionary path (see, e.g., [14,17]). These mutations are not 
independent of each other in general, and paint a complex 
picture of adaptation that involves deleterious and beneficial 
mutations that are conditional on the presence of each other 
and other alleles on the hap lo type, of valley crossings, com- 
pensatory mutations, and reversals. 




Fig. 1. NK-model haplotypes for N=16 and K=2. For these parameters, the 
fitness contribution of each locus is determined by interacting with 2 loci (adjacent 
in the representation shown here), giving rise to blocks of 2K+1 interacting genes. 
(A): Interactions between loci represented by lines. (B): Actual epistatic interac- 
tions on a particular high-fitness peaks, where the width of the lines indicates the 
strength of epistatic interactions (defined below). Three modules of interacting loci 
are colored. The remaining interactions (dashed grey lines) are weak. 



NK model 

The NK model of genetic interactions [37-39] is defined by cir- 
cular, binary sequences encoding the alleles at N loci, where 
each locus contributes to the fitness of the haplotype via an 
interaction with K other loci. For each of the N loci, we 
create a lookup-table with random numbers between and 1 
(drawn from a uniform distribution) that represent the fitness 
contribution Wi of a binary sequence of length K -\- 1. For 
example, the case K — 1 (interaction with one other locus) 
is modeled by creating random numbers for the four possible 
binary pairs 00,01,10,11 for each of the N loci, that is, the 
fitness contribution at one locus is conditional on the allele 
at one other adjacent locus. Whether or not interacting loci 
are adjacent is immaterial for asexual reproduction [43], but 
is crucial if recombination were to take place. Because in- 
dependent random numbers are drawn for the four different 
combinations, the fitness contribution of a locus to the overall 
fitness of the organism can change drastically depending on 
the allele of the interacting locus. The case K = (no inter- 
acting loci, and therefore vanishing epistasis) is the simplest. 
This choice gives rise to a smooth landscape with only a sin- 
gle peak that any search algorithm can locate in linear time, 
whereas increasing K makes the fitness of a locus dependent 
on a total of + 1 loci, resulting in a rugged landscape with 
multiple local peaks. At the same time, the fitness oi K -\- 1 
loci is affected by a single mutation, giving rise to pervasive 
pleiotropy that amplifies the ruggedness of the landscape by 
increasing the effect of single mutations. In Fig. lA, we show 
an example haplotype with N = 16 and X = 2, indicating the 
potential interactions. For high- fitness haplotypes, some in- 
teractions are stronger than others (Fig. IB), and lead to the 
formation of clusters of strongly interacting loci (modules). 

Here, we choose the fitness or each haplotype to be the 
geometric mean of the values Wi found in the lookup-tables. 
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rather than the average as is done traditionally [39]. This 
form is more realistic than its additive counterpart and has 
been suggested before [44,51]. In such a landscape, single 
mutations can potentially have a large effect on fitness, in- 
cluding lethality. The landscape defined by Eq. (1) gives rise 
to very few neutral mutations because each locus contributes 
to fitness in one way or the other, and we have not explicitly 
introduced alleles with zero fitness (lethals). We note that the 
results presented here do not depend on whether fitness is the 
arithmetic or the geometric mean. 

Quantifying Epistasis Two mutations (A and B) occurring on 
a haplotype with wild-type fitness Wo are said to be indepen- 
dent if the fitness effect of the joint mutation Wab /Wq equals 
the product of the fitness effect of each of the mutations alone, 
that is, Wa/Wo and Wb/Wq (see illustration in Fig. 2) 
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[2] 



We quantify epistasis as the deviation from this equality, such 
that 

1 WqWab .Ql 

is zero when the combined effect of the two mutations is the 
same as the product of the individual effects on fitness. This 
definition is equivalent to the usual quantitative definition of 
epistasis in a two-locus two- allele model (cf. [52], but see [21] 
for a different definition) and transforms to the well-known 
additive definition of epistasis when the individual fitness ef- 
fects are replaced by their logarithms (see, e.g., [53]). Such 
a quantitative measure of epistasis was also used in assessing 
epistasis between mutations in experiments with E. coli [54] 
and digital organisms [55]. For organisms on the line of de- 
scent (LOD, see Methods) of an evolutionary run, A and B 
refer to two substitutions that need not be adjacent either on 
the LOD or on the haplotype. We do see examples of non- 
consecutive mutations interacting, such as when a valley is 
crossed in more than one step (e.g., in Fig. 3, K — A)^ but 
here we restrict ourselves to studying the interaction between 
adjacent mutations on the LOD only, so that if Wab is the 
fitness of the haplotype that has both substitutions A and B, 
then the type preceding this sequence on the LOD has fit- 
ness Wa- Wb is found by reverting the first substitution (A), 
and measuring the fitness of the haplotype carrying only the 
second mutation (B). 




[1] 



Fig. 2. Schematic representation of epistasis. Two mutations A and B can inter- 
act epistatically in different ways with varying effects on fitness. The fitness of the 
wild-type is represented by the black baselines, and the heights of arrows represent the 
fitness after one mutation [Wa or Wb) and after both mutations {Wab)- Green: 
positive epistasis, red: negative epistasis. In (A), two independently beneficial mu- 
tations may have their joint effect increased or diminished {Wab larger or smaller), 
while in (B) the independent effect of the two mutations is deleterious and beneficial, 
respectively, and the combined expected effect on fitness is deleterious. In (C), each 
mutation by itself is deleterious, but when they interact the result can be reciprocal 
sign epistasis (green arrow). These sketches illustrate an additive model, where the 
sum of Wa and Wb is equal to Wab without epistasis. In our model using the 
geometric mean this corresponds to taking the logarithms of the fitness. 
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Results 

We studied the impact of epistasis on adaptation by conduct- 
ing evolutionary runs with different K (which changes the 
landscape's ruggedness) and a fixed number of A/" = 20 loci, 
for different mutation rates /x at a constant population size J\f 
of 5,000 individuals. In order to study only that part of evo- 
lutionary adaptation where a population climbs a local peak, 
each evolutionary run was initiated with a random selection 
of haplotypes of less than average fitness, akin to experiments 
with RNA viruses that are forced through bottlenecks [2,56], 
or are subject to environmental change [57]. The evolution- 
ary dynamics of each run were similar in most cases: the 
population quickly adapts and situates itself near the top of a 
local peak, after which the population enters a period of stasis 
when exploration of the adjacent parts of the landscape does 
not turn up any more beneficial mutations (Fig. 3). This pro- 
tocol is different (in terms of adaptation) from experiments in 
which only deleterious effects of mutations are studied, and 
advantageous mutations are found to be rare [58]. Thus, in 
this work we study the transient period of adaptation as op- 
posed to mutation-selection balance. Initiating populations 
with a single genotype only does not change the evolutionary 
dynamics we observe. 
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Fig. 3. Representative examples of adaptation in single lineages. Fitness on the 
line of descent for a simulation lasting 2,000 updates. The adaptive ascent is only 
shown until the lineage has attained the same fitness as it has after 2,000 updates, 
for Ar=20, K=0 (purple) and K=4 (blue), at a high mutation rate At=10~^. The 
inset shows an example line of descent at /x=10~^, with only beneficial mutations 
on the LOD. 



Epistatic pairs on the LOD The number of epistatic pairs of 
loci in a population increases as we increase because there 
will be epistasis whenever two mutated loci are located within 
a distance of K. We ask whether more or fewer mutations that 
interact show up on the line of descent compared to what is 
available in the population prior to selection. 

The fraction of epistatic pairs on the line of descent in 
the simulations is determined by the fraction of (consecutive) 
substitutions on the LOD for which e 0. Given N, 
and /i, we numerically compute the fraction of all mutational 
pairs that will interact before the mutations are screened by 
selection, by randomly mutating a haplotype and testing if 
any mutations are a distance of K loci or less away from each 
other. The higher mutation rate, the greater the chance that 
a haplotype will be hit by more than one mutation per compu- 
tational update, elevating the fraction above which is 
the expected fraction when only one mutation per generation 
is possible. 

We found that the fraction of epistatic pairs on the 
LOD differs significantly from the fraction available (the pre- 
selection prediction) when the mutation rate is high {/n = 
10~^, Fig. A). Because deleterious mutations enable organ- 
isms to cross valleys between peaks, the LOD is enriched by 
epistatic pairs that include deleterious mutations. Intuitively, 
interacting mutations will be most important if they do not go 
to fixation independently, which occurs as long as the muta- 
tion supply rate N'/j > 1, where JV is the effective population 
size. With our population size of A/" = 5,000, this limit is 
reached for jj, — 10~^ but not for fi — 10~^. When muta- 
tions fix independently, deleterious mutations cannot be res- 
cued by a subsequent beneficial mutation, and can therefore 
not appear on the LOD (Fig. B). As a consequence, epistatic 
pairs of mutations are avoided. There is no significant dif- 
ference between the frequency of selected epistatic pairs com- 
pared to those available at an intermediate mutation rate of 
11 — 10~^. Mutation rates are discussed in more detail in Sup- 
porting Text. At K — most mutational pairs consist of two 
beneficial mutations that do not interact epistatically. The 
exception at high mutation rates (Fig. SI A) is due to rever- 
sals mostly consisting of deleterious-beneficial pairs exhibiting 
positive epistasis (DB^, see Table 1 for the pair nomencla- 
ture), with a small minority of beneficial-deleterious pairs ex- 
hibiting negative epistasis (BD~). At > the distribution 
of mutational pairs changes, because mutations at loci affect- 
ing the same fitness component become more frequent. All 
four types of epistatic mutations (B^, D^, B~, D~) increase 
in frequency at the expense of mutations that do not interact 
epistatically (see Fig. SI). Overall, we observe that as we 
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Fig. 4. Fraction of epistatic pairs on line of descent. The fraction of mutational 
pairs on the LOD that interact epistatically (colored circles) is different from the nu- 
merical pre-selection prediction (black crosses), but depends on the mutation rate. 
Lines are drawn to guide the eye. {A): Blue: /^=10~^, more epistatic pairs on the 
LOD than expected (p=0.0020, Wilcoxon signed rank test). [B): Red: /^^lO"^, 
fewer epistatic pairs on the LOD than expected (p=0.0039). 



Fig. 5. Mean e and substitutions on the line of descent. (A) {e) on the LOD as 
defined by Eq. (4). Each datum is the average of 200 LODs. Error bars are standard 



error. Mutation rates are [1=10 



(blue). 



= 10~^ (green), and /x=10~^ (red). 



Population size is 5,000, Ar=20, and the replacement rate is 10%. Lines are drawn 
to guide the eye. (B) Total number of substitutions (length of the adaptive walk) 
as a function of K, mutation rates and colors as in (A ). 
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increase K, epistasis enables the population to use deleteri- 
ous mutations to adapt more efficiently, as valleys are crossed 
to ascend higher fitness peaks, though this effect is severely 
diminished when the mutation supply rate is low (/xA/" < 1), 
because mutations go to fixation before a second mutation 
occurs. 

Correlation between epistasis and beneficial effect We define 
the size of epistasis per epistatic pair on the LOD as the mean 
e between all consecutive pairs: 

i=l 

where the sum runs over all substitutions on the LOD, Si is 
the size of epistasis of the iih pair [between mutation and 
i on the LOD, given by Eq. (3)], and n is the number of pairs 
(one less than the number of substitutions) . This measure has 
an expectation value of zero if negatively and positively inter- 
acting pairs occur with equal likelihood, and with equal and 
opposite strength, on the LOD. We are studying the mean of 
s per mutational pair in order to compare this measure across 
evolutionary runs that differ in the average number of muta- 
tions on the LOD. We find that s increases with K (Fig. A), 
even at the smallest mutation rate where epistasis is selected 
against (see Fig. B). Higher mutation rates result in larger s 
per mutation on the LOD because the higher rate decreases 
the waiting time for new mutations, making it easier for a 
lineage to cross a valley in the fitness landscape via a delete- 
rious mutation. If a mutation is deleterious, the lineage that 
carries this mutation needs another mutation that at least 
compensates for the fitness loss before it is removed. 

While (e) increases with K, the number of substitutions 
during adaptation decreases (Fig. B), and the fraction of dele- 
terious substitutions is mostly unchanged between low and 
intermediate K (Fig. S2). The origin of the decrease in the 
length of the adaptive walk is clear: for K = 0, mutations 
that increase fitness are not difficult to find because the land- 
scape is smooth. More rugged landscapes risk confining the 
population to local peaks, and even though valleys can be 
crossed towards higher fitness peaks that are close, ultimately 
the ruggedness puts a stop to further adaptation [50]. Even 
though the number of substitutions decreases with adapta- 
tion is more efficient at intermediate K compared to lower K. 
Indeed, the attained fitness, Q (the fitness of the best geno- 
type at the end of a simulation run), increases with K up to 
intermediate values (Fig. 6A). As long a,s K < 5, the attained 
fitness is also an increasing function of (e) (Fig. 6B), that is, 
higher (e) goes hand in hand with higher achieved fitness. 

That adaptation can be more effective with fewer substi- 
tutions seems counterintuitive, but is achieved simultaneously 
by epistasis and pleiotropy. Pleiotropy is one of the main as- 
sumptions behind Fisher's geometric model, and appears to 
be common in nature [59,60]. In the NK model, the increased 
amplitude of the peaks at higher K (see Fig. S3) is caused 
by increased pleiotropy: when each fitness component is de- 
termined by + 1 loci, then in the model it is also true that 
each locus acts pleiotropically, affecting K -\- 1 fitness compo- 
nents. Pleiotropy can thus result in a single mutation increas- 
ing K fitness components at the same time, leading to the 
same fitness increase with fewer mutations. With luck, one 
mutation will increase fitness in all or most of the K com- 
ponents that it affects, amplifying the beneficial effect of the 
mutation. Pleiotropy is therefore directly responsible for the 
increase in potential selection coefficients as a function of K. 
Even though the chance that a mutation will have a positive 
effect on all 1 loci becomes smaller as K increases, the re- 



lationship between fitness increase per substitution is a linear 
function of K (Fig. 7), indicating that each mutation on the 
LOD carries a "bigger punch" as the number of interacting 
loci, K, increases. Besides changing the degree of pleiotropy, 
K also directly modulates epistasis. More epistasis causes 
the frequency of peaks and valleys to increase, which, just as 
pleiotropy, causes increased selection coefficients. Thus, both 
peak frequency and amplitude correlate with and together 
these two cause the increase in average selection coefficients 
for mutations by increasing the slope of the peaks. 

The correlation between the benefit a mutation provides 
and the amount of epistasis s between this and other muta- 
tions, as evidenced by Figs. 7 A and B, mirrors the obser- 
vation of a correlation between directional epistasis and the 
deleterious effects of mutations seen in other computational 
studies of evolution [48,61,62], as well as in protein evolution 
in vitro [63], bacterial evolution [64], and even viroids [65]. 
Because beneficial mutations are rare in most of these stud- 
ies, a correlation between positive effects and epistasis has not 
be shown before. 
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Fig. 6. Attained fitness as a function of ruggedness. Attained fitness, n, as a 
function of K for three different mutation rates (colors and parameters as in Fig. A), 
and as a function of (e) on LOD (B). For clarity, only Kg[0,5] are included in (B). 
Kopt, the point at which Q is maximal, is larger for higher mutation rates. 

Varying the mutation rate does not qualitatively change 
the observed effect of increased epistasis and pleiotropy, which 
reduce the number of substitutions. For all three mutation 
rates examined the attained fitness initially increases with K, 
peaks at = 3 — 5, and then decreases. However, the attained 
fitness level strongly depends on the mutation rate (Fig. 6A). 
For K > and /x = 10~^ a higher fitness is attained, and this 
value decreases monotonically as the mutation rate decreases. 
The magnitude of this effect increases with K, stressing the 
benefit of a high mutation rate in more rugged the landscapes. 
We found no significant effect of mutation rate on attained 
fitness for K = 0, where, apart from reversals, epistasis is 
absent. 

The attained fitness is maximal at = 3 to 5, from 
which we infer that an intermediate amount of epistasis and 
pleiotropy is most conducive to adaptation (Fig. 6A). The 
observed decrease in attained fitness at high K is caused by 
longer waiting times to new mutations (as was shown in [18]) 
which is a consequence of the increasingly rugged structure 
of the NK landscape for high K. As we increase K, the in- 
creased average effect of single mutations (either beneficial or 
deleterious) is counterbalanced by the increasing ruggedness 
of the landscape, which makes it more likely that the pop- 
ulation becomes stuck on a suboptimal fitness peak instead 
of locating the global peak. This lowers the average attained 
fitness of the population compared to lower K. 
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Discussion 

Epistasis between mutations appears to be the norm (even 
for randomly selected loci [36]) and can play a key role in 
shaping evolutionary trajectories. We studied how interacting 
mutations impact the evolutionary dynamics for populations 
evolving in an artificial fitness landscape in which the rugged- 
ness is determined by a single parameter K, in the "strong 
mutation" limit. We found that the more we allow for loci 
to interact by increasing K, the more consecutive mutations 
on the line of descent (LOD) interact, setting the stage for 
both faster adaptation as well as for the discovery of higher 
fitness peaks. Deleterious mutations can go to fixation via 
epistasis (in addition to hitchhiking), and the more rugged a 
landscape is the more common this mode of fixation becomes, 
and the more benefit an organism can derive from deleterious 
mutations. 

We found that increasing the ruggedness of the land- 
scape by raising K has several consequences. First, the mean 
epistatic effect per mutation (s) monotonically increases with 
K (Fig. ). Second, the number of substitutions on the LOD 
decreases, while the fraction of those substitutions that are 
deleterious or beneficial remains largely unchanged (Fig. S2). 
We might intuit that fewer beneficial substitutions impair 
adaptation, but here we instead observe a third effect, namely 
that those higher peaks that appear as K is increased can be 
located more effectively (Fig. 6), because the mean selection 
coefficient per mutation (Fig. 7) increases with K in an ap- 
proximately linear fashion. 




K s 

Fig. 7. Strength of selection coefficients and epistasis. (A): The effect on fitness 
of beneficial, (open symbols), and deleterious substitutions, (solid symbols), 
both increase linearly as a function of K (colors and parameters as in Fig. ). (B): 
Correlation between size of epistasis and effect of mutations, shown here for K=5 
and /j,=10~^. Reversal mutations are excluded because they do not contribute to 
adaptation. Including them would only strengthen the overall correlation. Pearson 
correlation coefficient r=0.3549 for K=5. 

The more rugged an NK fitness landscape is, the faster 
adaptation happens, up to a point. Ruggedness is normally 
viewed as an impediment to adaptation, because the presence 
of valleys means that the organism has to suffer a decrease in 
fitness before it can gain a fitness advantage [18]. However, 
in the NK model, increased ruggedness not only translates 
into more peaks to ascend and more valleys to cross, but also 
increases both the fitness difference between the peaks and 
valleys (amplitude) and the height of the global peak. As K 
increases, the mean height of peaks decreases beyond K = 1 
(Fig. S3) because many more shallow peaks appear than high 
ones. Yet, the global peak height continues to increase beyond 
K = 5 (but note that peaks can never exceed = 1, no mat- 
ter what the K). The attained fitness for /j, = 10~^ follows 
the average global peak height up until K = 5 after which 



it decreases, indicating that the population becomes stuck on 
suboptimal peaks for K > 5. 

As the number of peaks increases with K (thus shortening 
the mutational distance between peaks), the fitness decrease 
that an organism must endure while traversing the valley in 
between the peaks becomes larger. As a consequence, suc- 
cessful lineages must have an increased benefit per substitu- 
tion for higher K. The distribution of single-mutation fitness 
effects becomes broader (Fig. S4), allowing some mutations 
to increase the fitness of the haplotype by as much as a fac- 
tor of i^^ + 1 compared to the K = case. This is an effect 
of pleiotropy (which is inseparable from epistasis in this im- 
plementation of the NK model) and has been documented in 
the observed patterns of pleiotropy in yeast, nematodes, and 
mouse [66]. 

As discussed, the results presented here are a consequence 
of the increased frequency and amplitude of the peaks as well 
as the increase in global peak height (the height of the high- 
est peak) of the landscape as K is increased. It could be ar- 
gued that this increase in the size of the highest peaks (even 
as the mean peak height decreases) is an artifact of the NK 
model that has no counterpart in how biological fitness land- 
scapes change when the number of interactions between genes 
changes. Instead, we believe that the increase in adaptive 
potential is germane, because interacting loci must work syn- 
ergistically to produce higher fitness compared to a set of non- 
interacting loci. In a sense, increasing K creates the opportu- 
nity to find modules of epistatically interacting genes. Indeed, 
searching for epistatically interacting genes is one method to 
search for modules in metabolic genes [67], and a clustering 
method has been used recently to find modules from epistat- 
ically interacting pairs of genes in yeast [35]. Those authors 
found a dependence of the fraction of pairs that are epistatic 
on the size of the deleterious effect of a mutation that mirrors 
the dependence we observe here rather well (see Fig. S5), and 
thus strongly epistatic pairs of mutations provide the largest 
fitness benefit also in yeast. 

For the NK model, we can understand why the global peak 
fitness increases as a direct consequence of the modularity of 
the fitness components: each fitness component Wi is con- 
trolled by K -\- 1 loci, giving 2^^^ possible values. It is more 
likely to find higher fitness values in those larger samples. So, 
just as in the biological pathways with modular structure in 
yeast, the more loci that contribute to a fitness component, 
the better this component can be fine-tuned to optimize its 
contribution to fitness. Given these considerations, we con- 
tend that the NK fitness landscape, obtained from interacting 
loci that synergistically contribute to the function of traits, is 
a reasonable and appropriate model for describing interacting 
gene networks in biological organisms. 

Materials and Methods 

Simulations. We simulated the evolutionary process by randomly removing 10% of 
the population every update, and replacing them with copies of a subset of the remain- 
ders, selected with probabilities proportional to individual fitness. This is akin to the 
Wright-Fisher model for haploid asexuals [68], but with overlapping generations. In 
evolution experiments implemented in flow reactors (for example, continuous culture 
experiments, see [69]), the replacement rate is akin to the flow rate of the reactor. 
Varying the replacement rate does not change the conclusions we reach in this study. 
We define the period of adaptation as beginning at update zero, and ending when the 
lineage first reaches the same fitness that it acquired at the end of the simulation. In 
this manner, we exclude from the analysis reversal mutations (i.e., mutations undoing 
previous mutations at the same locus) that occasionally occur after a fitness peak has 
been ascended. If we included reversals after the highest fitness is attained, both the 
number of deleterious substitutions and the amount of epistasis measured would be 
affected, even though they do not contribute to adaptation. In order to study that 
part of the evolutionary trajectory that corresponds to climbing the nearest fitness 
peak, we choose as the ancestral population a sample of individuals with fitness in 
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the lowest 50% of a randomly generated population where the haplotypes of the in- 
dividuals are uncorrelated. As a consequence, individual lineages may climb different 
peaks (except for K=0, in which case there is only one peak), and the lineage that 
happens to climb the fastest will be most likely to outcompete the other organisms 
in the population. This protocol is similar in spirit to that used in references [2,56], 
where a population of <^>6 viruses was put through bottlenecks in order to study the 
dynamics of re-adaptation. For each mutation rate and for each K, we collected 200 
independent evolutionary runs and extracted one line of descent (LOD, see below) 
from each. In our results, we report the average values across these 200 samples, and 
provide standard errors. The probability of each locus changing its binary value is set 
by a per-site mutation rate /j,. While the average rate of mutation is fixed, the process 
itself is stochastic so that the distribution of the number of mutations per organism is 
Poisson-random with the given mean. We varied K from (no interaction between 
neighboring loci) to 10, where each locus interacts with ten of its neighboring loci. 
Because the haplotypes are circular, for K=10 all mutational pairs interact (100% of 
mutational pairs are epistatic). 



Line of Descent. We record the sequence of mutations that accumulates as popula- 
tions adapt from an initial state of low fitness to the maximum fitness they can attain 
given their environment, by studying a single individual lineage from its inception to 
the end of the simulation run. We do this by picking the most fit organism after a set 
number of simulation updates, and then track this individual's ancestry all the way 
back to the beginning of the simulation. We define this sequence of mutations as the 
line of descent (LOD), and discard all other data from that simulation [14,17]. 

ACKNOWLEDGMENTS. We would like to thank J.E. Barrick, Z.D. Blount, and 
R.E. Lenski for discussions. This work was supported in part by a grant from the 
Cambridge Templeton Consortium, by the National Science Foundation's Frontiers in 
Integrative Biological Research grant FIBR-0527023, and by NSF's BEACON Center 
for Evolution in Action, under contract No. DBI-0939454. 



1. Fisher R (1930) The Genetical Theory of Natural Selection (Oxford University Press, 
Oxford, UK). 

2. Burch CL, Chao L (1999) Evolution by small steps and rugged landscapes in the RNA 
virus *6. Genetics 151:921-927. 

3. Orr HA (2005) The genetic theory of adaptation: A brief history. Nature Reviews 
Genetics 6:119-127. 

4. Gillespie JH (1984) Molecular evolution over the mutational landscape. Evolution 
38:1116-1129. 

5. Gillespie J (1991) The Causes of Molecular Evolution (Oxforn University Press, New 
York, NY). 

6. Orr H (2002) The population genetics of adaptation: The adaptation of DNA se- 
quences. Evolution 56:1317-1330. 

7. Kim Y, Orr HA (2005) Adaptation in sexuals vs. asexuals: Clonal interference and the 
Fisher-Muller model. Genetics 171:1377-86. 

8. Kryazhimskiy S, Tkacik G, Plotkin JB (2009) The dynamics of adaptation on correlated 
fitness landscapes. Proc Natl Acad Sci U S A 106:18638-43. 

9. Desai MM, Fisher DS, Murray AW (2007) The speed of evolution and maintenance 
of variation in asexual populations. Current Biology 17:385-394. 

10. Desai MM, Fisher DS (2007) Beneficial mutation-selection balance and the effect of 
linkage on positive selection. Genetics 176:1759-1798. 

11. Kimura M (1985) The role of compensatory neutral mutations in molecular evolution. 
J Genetics 64:7-19. 

12. Barrick J, Lenski RE (2009) Genome-wide mutational diversity in an evolving popula- 
tion of Escherichia coli. Cold Spring Harb Symp Quant Biol 74:119-129. 

13. Blount ZD, Borland CZ, Lenski RE (2008) Historical contingency and the evolution 
of a key innovation in an experimental population of Escherichia coli. Proceedings of 
the National Academy of Sciences of the United States of America 105:7899-7906. 

14. Lenski RE, Ofria C, Pennock RT, Adami C (2003) The evolutionary origin of complex 
features. Nature 423:139-144. 

15. Bridgham JT, Carroll SM, Thornton JW (2006) Evolution of hormone-receptor com- 
plexity by molecular exploitation. Science 312:97-101. 

16. Poelwijk FJ, Kiviet DJ, Tans SJ (2006) Evolutionary potential of a duplicated 
repressor-operator pair: Simulating pathways using mutation data. PLoS Computa- 
tional Biology 2:467-475. 

17. Cowperthwaite MC, Bull JJ, Meyers LA (2006) From bad to good: Fitness reversals 
and the ascent of deleterious mutations. PLoS Computational Biology 2:1292-1300. 

18. Clune J, et al. (2008) Natural selection fails to optimize mutation rates for long-term 
adaptation on rugged fitness landscapes. PLoS Computational Biology 4:el000187. 

19. Zuckerkandl E, Pauling L (1965) Evolutionary divergence and convergence in proteins. 
In Bryson V, Vogel HJ, eds.. Evolving Genes and Proteins (Academic Press), pp. 
97-166. 

20. Bloom JD, Arnold FH (2009) In the light of directed evolution: Pathways of adaptive 
protein evolution. Proceedings of the National Academy of Sciences of the United 
States of America 106:9995-10000. 

21. Phillips PC (2008) Epistasis - the essential role of gene interactions in the structure 
and evolution of genetic systems. Nature Reviews Genetics 9:855-867. 

22. Weinreich DM, Watson RA, Chao L (2005) Sign epistasis and genetic constraint on 
evolutionary trajectories. Evolution 59:1165-1174. 

23. Poelwijk FJ, Kiviet DJ, Weinreich DM, Tans SJ (2007) Empirical fitness landscapes 
reveal accessible evolutionary paths. Nature 445:383-386. 

24. Reetz MT, Bocola M, Carballeira JD, Zha DX, Vogel A (2005) Expanding the range 
of substrate acceptance of enzymes: Combinatorial active-site saturation test. Ange- 
wandte Chemie-lnternational Edition 44:4192-4196. 

25. Weinreich DM, Chao L (2005) Rapid evolutionary escape by large populations from 
local fitness peaks is likely in nature. Evolution 59:1175-1182. 

26. Weinreich DM, Delaney NF, DePristo MA, HartI DL (2006) Darwinian evolution can 
follow only very few mutational paths to fitter proteins. Science 213:111-114. 

27. Lockless SW, Ranganathan R (1999) Evolutionarily conserved pathways of energetic 
connectivity in protein families. Science 286:295-299. 

28. Whitlock MC, Phillips PC, Moore FBG, Tonsor SJ (1995) Multiple fitness peaks and 
epistasis. Annual Review of Ecoogy and Systematics 26:601-29. 



29. Coyne JA, Barton NH, Turelli M (2000) Is Wright's shifting balance process important 
in evolution? Evolution 54:306-317. 

30. Phillips P, Otto S, Whitlock M (2000) Beyond the average, the evolutionary impor- 
tance of gene interactions and variability of epistatic effects. In Wolf J, Brodie III E, 
Wade M, eds., Epistasis and the Evolutionary Process (Oxford University Press), pp. 
20-38. 

31. Tyler AL, Asselbergs FW, Williams SM, Moore JH (2009) Shadows of complexity: 
what biological networks reveal about epistasis and pleiotropy. Bioessays 31:220-227. 

32. Kelley R, Ideker T (2005) Systematic interpretation of genetic interactions using pro- 
tein networks. Nature Biotechnology 23:561-566. 

33. Ulitsky I, Shamir R (2007) Pathway redundancy and protein essentiality revealed in 
the saccharomyces cerevisiae interaction networks. Molecular Systems Biology 3. 

34. Roguev A, et al. (2008) Conservation and rewiring of functional modules revealed by 
an epistasis map in fission yeast. Science 322:405-410. 

35. Costanzo M, et al. (2010) The genetic landscape of a cell. Science 327:425-31. 

36. Remold SK, Lenski RE (2004) Pervasive joint influence of epistasis and plasticity on 
mutational effects in Escherichia coli. Nature Genetics 36:423-426. 

37. Kauffman S, Levin S (1987) Towards a general theory of adaptive walks on rugged 
landscapes. Journal of Theoretical Biology 128:11-45. 

38. Kauffman SA, Weinberger ED (1989) The NK model of rugged fitness landscapes and 
its application to maturation of the immune response. Journal of Theoretical Biology 
141:211-245. 

39. Kauffman SA (1993) The Origins of Order: Self-Organization and Selection in Evolu- 
tion (Oxford University Press US). 

40. Macken CA, Perelson AS (1989) Protein evolution on rugged landscapes. Proceedings 
of the National Academy of Sciences of the United States of America 86:6191-6195. 

41. Perelson AS, Macken CA (1995) Protein evolution on partially correlated landscapes. 
Proceedings of the National Academy of Sciences of the United States of America 
92:9657-9661. 

42. Solow D, Burnetas A, Roeder T, Greenspan NS (1999) Evolutionary consequences of 
selected locus-specific variations in epistasis and fitness contribution of Kauffman's 
NK model. Journal of Theoretical Biology 196:181-196. 

43. Campos PR, Adami C, Wilke CO (2002) Optimal adaptive performance and delocal- 
ization in NK fitness landscapes. Physica A 304:495-506. 

44. Welch J J, Waxman D (2005) The NK model and population genetics. Journal of 
Theoretical Biology 234:329-340. 

45. Orr HA (2006) The population genetics of adaptation on correlated fitness landscapes: 
The block model. Evolution 60:1113-1124. 

46. Wagner A (1996) Does evolutionary plasticity evolve? Evolution 50:1008-1023. 

47. Ciliberti S, Martin OC, Wagner A (2007) Robustness can evolve gradually in complex 
regulatory gene networks with varying topology. Plos Computational Biology 3:164- 
173. 

48. Azevedo RBR, Lohaus R, Srinivasan S, Dang KK, Burch CL (2006) Sexual reproduc- 
tion selects for robustness and negative epistasis in artificial gene networks. Nature 
440:87-90. 

49. Espinosa-Soto C, Wagner A (2010) Specialization can drive the evolution of modular- 
ity. PLoS Computational Biology 6:el000719. 

50. Weissman DB, Desai MM, Fisher DS, Feldman MW (2009) The rate at which asexual 
populations cross fitness valleys. Theoretical Population Biology 75:286-300. 

51. Solow D, Burnetas A, Tsai M, Greenspan NS (2000) On the expected performance of 
systems with complex interactions among components. Complex Systems 12:423-456. 

52. Bonhoeffer S, Chappey C, Parkin NT, Whitcomb JM, Petropoulos CJ (2004) Evidence 
for positive epistasis in HIV-1. Science 306:1547-1550. 

53. Mani R, St Onge RP, Hartman IV JL, Giaever G, Roth FP (2008) Defining genetic 
interaction. Proceedings of the National Academy of Sciences of the United States of 
America 105:3461-3466. 

54. Elena SF, Lenski R (1997) Test of synergistic interactions among deleterious mutations 
in bacteria. Nature 390:395-397. 

55. Lenski RE, Ofria C, Collier TC, Adami C (1999) Genome complexity, robustness and 
genetic interactions in digital organisms. Nature 400:661-664. 

56. Burch CL, Chao L (2000) Evolvability of an RNA virus is determined by its mutational 
neighbourhood. Nature 406:625-628. 



6 I www.pnas.org/cgi/doi/10.1073/pnas.0709640104 



Footline Author 



57. Wichman HA, Badgett MR, Scott LA, Boulianne CM, Bull JJ (1999) Different tra- 
jectories of parallel evolution during viral adaptation. Science 285:422-424. 

58. Eyre-Walker A, Keightley PD (2007) The distribution of fitness effects of new muta- 
tions. Nature Reviews Genetics 8:610-618. 

59. Ostrowski EA, Rozen DE, Lenski RE (2005) Pleiotropic effects of beneficial mutations 
in Escherichia coli. Evolution 59:2343-2352. 

60. Wagner G, et al. (2008) Pleiotropic scaling of gene effects and the 'cost of complexity'. 
Nature 452:470-U9, cited References Count:26. 

61. Wilke CO, Adami C (2001) Interaction between directional epistasis and average mu- 
tational effects. Proceedings of the Royal Society of London B 268:1469-1474. 

62. Wilke CO, Lenski RE, Adami C (2003) Compensatory mutations cause excess of an- 
tagonistic epistasis in RNA secondary structure folding. BMC Evolutionary Biology 
3:3. 

63. Bershtein S, Segal M, Bekerman R, Tokuriki N, Tawflk DS (2006) Robustness-epistasis 
link shapes the fitness landscape of a randomly drifting protein. Nature 444:929-932. 



64. Beerenwinkel N, Pachter L, Sturmfels B, Elena SF, Lenski RE (2007) Analysis of 
epistatic interactions and fitness landscapes using a new geometric approach. BMC 
Evolutionary Biology 7. 

65. Sanjuan R, Forment J, Elena SF (2006) In silico predicted robustness of viroid RNA 
secondary structures. II. Interaction between mutation pairs. Molecular Biology and 
Evolution 23:2123-2130. 

66. Wang Z, Liao BY, Zhang J (2010) Genomic patterns of pleiotropy and the evolution 
of complexity. Proceedings of the National Academy of Sciences of the United States 
of America www.pnas.org/cgi/doi/10.1073/pnas.1004666107. 

67. Segre D, DeLuna A, Church GM, Kishony R (2005) Modular epistasis in yeast 
metabolism. Nature Genetics 37:77-83. 

68. Donnelly P, Weber N (1985) The Wright-Fisher model with temporally varying selec- 
tion and population size. Journal of Mathematical Biology 22:21-29. 

69. Lindemann BF, Klug C, Schwienhorst A (2002) Evolution of bacteriophage in contin- 
uous culture: a model system to test antiviral gene therapies for the emergence of 
phage escape mutants. Journal of Virology 76:5784-5792. 



Footline Author 



PNAS I Issue Date | Volume | Issue Number | 7 



Table 1. Relationship between positive/negative and synergistic/antagonistic epistasis for 
different mutational pairs. Positive (s > 0) and negative (s < 0) epistasis imply synergis- 
tic/antagonistic if the two mutations are both beneficial or both deleterious, but when the 
mutations are of opposite effect the meaning of synergy or antagony is unclear (dashes). A 
substitution can be characterized by how it interacts with the mutation that precedes it on 
the LOD using the sign of e. Beneficial substitutions are designated or B~, depending 
on whether they interacted epistatically with the preceding substitution to form positive or 
negative epistasis, respectively. and D~ similarly indicate deleterious substitutions with 
positive and negative epistasis. Alternatively, writing BB^ indicates that both substitutions 
increased fitness, and that the second substitution had a larger effect on the background 
of the first. DB~ denotes a deleterious followed by a beneficial substitution that did not 
increase fitness as much as it would have if the deleterious substitution had not occurred. 
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